All Questions

Tagged with
0 votes
1 answer
23 views

Struggling with normalization/standardization for a machine learning dataset

Sorry for what is probably a very obvious/rookie question. I'm currently doing a data science module for my degree and making very slow progress with the work. The case study I'm doing is around HR ...
Alex Ferry
1 vote
0 answers
152 views

Generating quality synthetic tabular data - is it possible when one's dataset is extremely small?

I've got a dataset consisting of only 17 samples and 6 continuous features (all values in the dataset contain decimals, although 2 features exhibit categorical-ish behaviour). I'm looking at the ...
user23493275
0 votes
1 answer
28 views

How to build a model using two input datasets that have no common column to merge or combine on

I want to create a model for a trucking company whose trucks deliver cars to customers. I have two datasets. One is customer details, such as how many cars they want from a particular area or terminal, and ...
prema
0 votes
1 answer
226 views

Different range of target values in a neural network

I am working on neural network regression code. The dataset includes 14 features with values between -1 and 1, while the target variable ranges from 0.000759 to 1100. The target ...
Mali
1 vote
1 answer
139 views

Input shape - How to feed metadata to a ML model?

I have metadata such as hospital layout, number of rooms, number of patients per day, etc., and I also have data on the doctor’s check-ins, which is more granular. How do I feed this data ...
StephM
0 votes
1 answer
55 views

Transform a dataset into a regression problem by sorting?

I have a raw unlabeled dataset, and I want to design a model to perform a regression. In my dataset, it does not make sense to give each observation a value, but it does make sense to sort them. Can I ...
Giuliano Mirabella
0 votes
0 answers
70 views

How do we scientifically describe easy and difficult datasets?

Sometimes we hear that a dataset is easy or difficult for the same task (like classifying cats and dogs). How can we describe this in a scientific way (using math, probability, or statistics)? For example, we ...
Mahdi Amrollahi
1 vote
0 answers
19 views

Are there any known methods to generalize multiple trajectories into one "optimal" path based on energy consumption?

Say that I have a database with timeseries coordinate data from a vehicle going from A to B multiple times but with slightly varying trajectories each time, leading to different amounts of energy ...
Max Månsson
0 votes
1 answer
144 views

Difference between scaling just X or both X and Y in PCA / principal component regression

Before doing principal component regression it is important to scale the data. But which data exactly? Is it enough if I just scale X or do I have to scale the whole data set, containing X and Y (=...
Sally
2 votes
1 answer
43 views

Given a regression-based model with many feature variables, what tools would you utilize to figure out which feature variables add the most variance?

Given a hypothetical dataset {S} with 100 X feature variables and 10 predicted Y variables. X1 ... X100 Y1 .... Y10 1 .. 2 3 .. 4 4 .. 3 2 .. 1 Let's say I want to improve the accuracy of Y1. I am ...
Sad CRUD Developer
1 vote
1 answer
23 views

Material Science dataset with feature-dependent inputs

I'm dealing with a material science/chemistry dataset where I have a bunch of duplicate input formulas corresponding to different values of a specific feature such as temperature. It looks something ...
James Arten
1 vote
1 answer
20 views

Sum of squares for matrix valued data over $\mathbb{R}$ and $\mathbb{C}$

Let us assume we have $k \times k$ matrix valued data and assume this is organized (possibly as time series): $$ M_1, M_2, \ldots, M_n $$ Now, assume we are interested in writing down an error ...
Marion
0 votes
1 answer
50 views

Creating a training dataset from analytical solution

I am currently redesigning an inverse problem on an experimental technique, but I am having doubts about how to create a training dataset. Here is the problem I am trying to solve: I have already ...
mechanics_physics
0 votes
0 answers
76 views

My validation set losses are static

My dataset looks something like this: ...
Mr. Johnny Doe
4 votes
2 answers
207 views

Was this dataset analysed correctly? [closed]

There's a Twitter thread going around that claims there are signs of voter fraud due to anomalies in the election vote count data set. You can download the dataset here and find the script to generate ...
JansthcirlU
